A clustering-based discretization for supervised learning
Similar resources
Based on Similarity Metric Learning for Semi-Supervised Clustering
Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. The focus of this paper is on metric learning, with particular interest in incorporating side information to make it semi-supervised. This study is primarily motivated by an application: face-image clustering. The paper introduces metric learning and semi-supervised clustering, similarity metric l...
Semi-supervised Zero-Shot Learning by a Clustering-based Approach
In some object recognition problems, labeled data may not be available for all categories. Zero-shot learning utilizes auxiliary information (also called signatures) describing each category in order to find a classifier that can recognize samples from categories with no labeled instance. In this paper, we propose a novel semi-supervised zero-shot learning method that works on an embedding s...
Optimal Multiple Intervals Discretization of Continuous Attributes for Supervised Learning
5, av. Pierre Mendès-France, 69676 Bron Cedex, France. {zighed,rakotoma,ffeschet}@univ-lyon2.fr. In this paper, we propose an extension of Fischer’s algorithm to compute the optimal discretization of a continuous variable in the context of supervised learning. Our algorithm is highly efficient since it depends only on the number of runs and not directly on the number of points of the sample dat...
Learning Kernels for Semi-Supervised Clustering
As a recently emerging technique, semi-supervised clustering has attracted significant research interest. Compared to traditional clustering algorithms, which only use unlabeled data, semi-supervised clustering employs both unlabeled and supervised data to obtain a partitioning that conforms more closely to the user's preferences. Several recent papers have discussed this problem (Cohn, Caruana, ...
Semi-Supervised Learning for Web Text Clustering
Supervised learning algorithms usually require large amounts of training data to learn reasonably accurate classifiers. Yet, for many text classification tasks, providing labeled training documents is expensive, while unlabeled documents are readily available in large quantities. Learning from both labeled and unlabeled documents in a semi-supervised framework is a promising approach to reduc...
Journal
Journal title: Statistics & Probability Letters
Year: 2010
ISSN: 0167-7152
DOI: 10.1016/j.spl.2010.01.015